Search results for "computer and information sciences"
showing 10 items of 1328 documents
Bayesian subcohort selection for longitudinal covariate measurements in follow-up studies
2016
We consider planning longitudinal covariate measurements in follow-up studies where covariates are time-varying. We assume that the entire cohort cannot be selected for longitudinal measurements due to financial limitations and study how a subset of the cohort should be selected optimally in order to obtain precise estimates of covariate effects in a survival model. In our approach, the study will be designed sequentially utilizing the data collected in previous measurements of the individuals as prior information. We propose using a Bayesian optimality criterion in the subcohort selections, which is compared with simple random sampling using simulated and real follow-up data. This study ex…
Spatial noise-aware temperature retrieval from infrared sounder data
2020
In this paper we present a combined strategy for the retrieval of atmospheric profiles from infrared sounders. The approach considers the spatial information and a noise-dependent dimensionality reduction approach. The extracted features are fed into a canonical linear regression. We compare Principal Component Analysis (PCA) and Minimum Noise Fraction (MNF) for dimensionality reduction, and study the compactness and information content of the extracted features. Assessment of the results is done on a big dataset covering many spatial and temporal situations. PCA is widely used for these purposes but our analysis shows that one can gain significant improvements of the error rates when using…
How Do Software Companies Deal with Artificial Intelligence Ethics? : A Gap Analysis
2022
The public and academic discussion on Artificial Intelligence (AI) ethics is accelerating and the general public is becoming more aware of AI ethics issues such as data privacy in these systems. To guide ethical development of AI systems, governmental and institutional actors, as well as companies, have drafted various guidelines for ethical AI. Though these guidelines are becoming increasingly common, they have been criticized for a lack of impact on industrial practice. There seems to be a gap between research and practice in the area, though its exact nature remains unknown. In this paper, we present a gap analysis of the current state of the art by comparing practices of 39 companies that …
The Average State Complexity of the Star of a Finite Set of Words Is Linear
2008
We prove that, for the uniform distribution over all sets $X$ of $m$ (that is a fixed integer) non-empty words whose sum of lengths is $n$, $\mathcal{D}_X$, one of the usual deterministic automata recognizing $X^*$, has on average $\mathcal{O}(n)$ states and that the average state complexity of $X^*$ is $\Theta(n)$. We also show that the average time complexity of the computation of the automaton $\mathcal{D}_X$ is $\mathcal{O}(n\log n)$, when the alphabet is of size at least three.
Disentangling Derivatives, Uncertainty and Error in Gaussian Process Models
2020
Gaussian Processes (GPs) are a class of kernel methods that have been shown to be very useful in geoscience applications. They are widely used because they are simple, flexible and provide very accurate estimates for nonlinear problems, especially in parameter retrieval. In addition to a predictive mean function, GPs come equipped with a useful property: the predictive variance function which provides confidence intervals for the predictions. The GP formulation usually assumes that there is no input noise in the training and testing points, only in the observations. However, this is often not the case in Earth observation problems where an accurate assessment of the instrument error is usually a…
Text Compression Using Antidictionaries
1999
We give a new text compression scheme based on Forbidden Words ("antidictionary"). We prove that our algorithms attain the entropy for balanced binary sources. They run in linear time. Moreover, one of the main advantages of this approach is that it produces very fast decompressors. A second advantage is a synchronization property that is helpful to search compressed data and allows parallel compression. Our algorithms can also be presented as "compilers" that create compressors dedicated to any previously fixed source. The techniques used in this paper are from Information Theory and Finite Automata.
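As a rough illustration of how forbidden words can drive compression, the sketch below assumes a binary text and an antidictionary that is already known to both sides. Whenever the text read so far ends with $u$ and $u0$ is forbidden, the next bit must be 1 and can be omitted (and symmetrically for $u1$). The function names, the list representation of the antidictionary, and the simple linear scan are assumptions made for illustration; they are not the linear-time, automaton-based construction of the paper.

```python
def forced_bit(prefix, antidictionary):
    """Return the bit forced after `prefix`, or None if the next bit is free.

    Assumption: the text being coded contains no word of `antidictionary`,
    so two forbidden words never force conflicting bits.
    """
    for word in antidictionary:
        stem, last = word[:-1], word[-1]
        if prefix.endswith(stem):
            # stem + last is forbidden, so the next bit must be the other one.
            return "1" if last == "0" else "0"
    return None


def compress(text, antidictionary):
    """Keep only the bits that the antidictionary cannot predict."""
    kept = [bit for i, bit in enumerate(text)
            if forced_bit(text[:i], antidictionary) is None]
    # The original length is needed to know when decoding should stop.
    return "".join(kept), len(text)


def decompress(code, length, antidictionary):
    """Rebuild the text by re-inserting every predictable bit."""
    text = ""
    free_bits = iter(code)
    while len(text) < length:
        bit = forced_bit(text, antidictionary)
        text += bit if bit is not None else next(free_bits)
    return text


# Hypothetical example: "00" never occurs, so every bit after a 0 is a 1.
code, n = compress("0101101", ["00"])
restored = decompress(code, n, ["00"])
```

In this toy case the 7-bit text is stored as 4 bits plus its length, and decoding only needs the same suffix test as encoding.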
Force-velocity profiling in athletes: Reliability and agreement across methods
2021
The aim of the study was to examine the test-retest reliability and agreement across methods for assessing individual force-velocity (FV) profiles of the lower limbs in athletes. Using a multicenter approach, 27 male athletes completed all measurements for the main analysis, with up to 82 male and female athletes on some measurements. The athletes were tested twice before and twice after a 2- to 6-month period of regular training and sport participation. The double testing sessions were separated by ~1 week. Individual FV-profiles were acquired from incremental loading protocols in squat jump (SJ), countermovement jump (CMJ) and leg press. A force plate, linear encoder and a flight time cal…
Qualitative Comparison of Community Detection Algorithms
2011
Community detection is a very active field in complex networks analysis, consisting of identifying groups of nodes more densely interconnected relative to the rest of the network. The existing algorithms are usually tested and compared on real-world and artificial networks, their performance being assessed through some partition similarity measure. However, the realism of artificial networks can be questioned, and the appropriateness of those measures is not obvious. In this study, we take advantage of recent advances concerning the characterization of community structures to tackle these questions. We first generate networks thanks to the most realistic model available to date. Their analysis r…
Binary jumbled string matching for highly run-length compressible texts
2012
The Binary Jumbled String Matching problem is defined as: Given a string $s$ over $\{a,b\}$ of length $n$ and a query $(x,y)$, with $x,y$ non-negative integers, decide whether $s$ has a substring $t$ with exactly $x$ $a$'s and $y$ $b$'s. Previous solutions created an index of size $O(n)$ in a pre-processing step, which was then used to answer queries in constant time. The fastest algorithms for construction of this index have running time $O(n^2/\log n)$ [Burcsi et al., FUN 2010; Moosa and Rahman, IPL 2010], or $O(n^2/\log^2 n)$ in the word-RAM model [Moosa and Rahman, JDA 2012]. We propose an index constructed directly from the run-length encoding of $s$. The construction time of our index i…
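The problem statement and the $O(n)$-size index can be made concrete with a small sketch. It relies on the standard interval property of binary strings: among all substrings of a fixed length, the number of $a$'s takes every value between its minimum and its maximum, so storing those two numbers per length is enough to answer a query in constant time. The naive $O(n^2)$ construction and the function names below are assumptions for illustration only; this is neither the run-length-based index proposed in the abstract nor one of the faster cited constructions.

```python
def build_index(s):
    """Naive O(n^2) index: min and max number of 'a's per substring length."""
    n = len(s)
    # prefix[i] = number of 'a's in s[:i]
    prefix = [0]
    for ch in s:
        prefix.append(prefix[-1] + (ch == "a"))
    min_a = [0] * (n + 1)
    max_a = [0] * (n + 1)
    for length in range(1, n + 1):
        counts = [prefix[i + length] - prefix[i]
                  for i in range(n - length + 1)]
        min_a[length] = min(counts)
        max_a[length] = max(counts)
    return min_a, max_a


def has_jumbled_match(index, x, y):
    """Constant-time query: is there a substring with x 'a's and y 'b's?"""
    min_a, max_a = index
    length = x + y
    if length >= len(min_a):
        return False  # longer than the indexed string
    return min_a[length] <= x <= max_a[length]


index = build_index("aabab")
```

For the hypothetical string `aabab`, the query $(2,1)$ succeeds (for example via `aab`), while $(0,2)$ fails because `bb` never occurs.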
A Motzkin filter in the Tamari lattice
2015
The Tamari lattice of order $n$ can be defined on the set $T_n$ of binary trees endowed with the partial order relation induced by the well-known rotation transformation. In this paper, we restrict our attention to the subset $M_n$ of Motzkin trees. This set appears as a filter of the Tamari lattice. We prove that its diameter is $2n-5$ and that its radius is $n-2$. Enumeration results are given for join and meet irreducible elements, minimal elements and coverings. The set $M_n$ endowed with an order relation based on a restricted rotation is then isomorphic to a ranked join-semilattice recently defined in Baril and Pallo (2014). As a consequence, we deduce an upper bound for the rotation distan…